[Draft] Add Web API for MarkItDown #202

Open: wants to merge 6 commits into main

vs4vijay
Member

Related to #133

@vs4vijay
Member Author

In Testing

@vs4vijay changed the title from "Add Web API for MarkItDown" to "[Draft] Add Web API for MarkItDown" on Dec 22, 2024
@xingxingcan

Thank you very much. We need API access in order to use this in automated processes. We are very optimistic about this project; thanks again to the great developers.

@homanp

homanp commented Jan 3, 2025

I deployed a simple REST API; y'all can try it out. Here are the specs:

https://x.com/pelaseyed/status/1872326539208986903


@GerardSmit left a comment

I tried to run the API locally through Docker. I had to change the following things to make it work.

@@ -12,12 +12,12 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
 ffmpeg \
 && rm -rf /var/lib/apt/lists/*

-RUN pip install markitdown
+RUN pip install markitdown fastapi uvicorn

Suggested change
-RUN pip install markitdown fastapi uvicorn
+RUN pip install markitdown fastapi uvicorn python-multipart

Without python-multipart the API won't start.
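
One way to catch this kind of missing dependency before starting the server is a small preflight check. The sketch below is only an illustration and not part of this PR: the `missing_packages` helper and the `REQUIRED` mapping are hypothetical, and the importable module names are assumptions (python-multipart has been importable as `multipart` and, in newer releases, as `python_multipart`).

```python
from importlib.util import find_spec

# Hypothetical mapping: pip package name -> candidate top-level module names.
# The module names are assumptions; adjust them to the versions you install.
REQUIRED = {
    "markitdown": ("markitdown",),
    "fastapi": ("fastapi",),
    "uvicorn": ("uvicorn",),
    "python-multipart": ("python_multipart", "multipart"),
}


def missing_packages(required=REQUIRED):
    """Return the pip package names for which no candidate module is importable."""
    return [
        package
        for package, modules in required.items()
        # A package counts as present if any one of its candidate modules is found.
        if not any(find_spec(module) is not None for module in modules)
    ]
```

Calling `missing_packages()` inside the built image and failing when the returned list is non-empty would surface the problem at build or start time rather than as a startup error from the API.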

It's still odd that markitdown is installed from pip here. This means that if the source changes and you want to check whether the Docker image works, you first need to publish the package to PyPI. It would be better to include the current src folder here instead.


 # Default USERID and GROUPID
 ARG USERID=10000
 ARG GROUPID=10000

 USER $USERID:$GROUPID

-ENTRYPOINT [ "markitdown" ]
+ENTRYPOINT ["uvicorn", "src.markitdown.api:app", "--host", "0.0.0.0", "--port", "8000"]

This doesn't work: src/markitdown/api.py doesn't exist in the image, so the API will not start.

You must copy the src directory:

Suggested change
-ENTRYPOINT ["uvicorn", "src.markitdown.api:app", "--host", "0.0.0.0", "--port", "8000"]
+COPY src /src
+ENTRYPOINT ["uvicorn", "src.markitdown.api:app", "--host", "0.0.0.0", "--port", "8000"]

Note that .dockerignore currently ignores everything, so the COPY won't work.
Modify .dockerignore so that the src/ folder is included:

*
!/src
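
For context on why the missing src directory prevents startup: the argument `src.markitdown.api:app` is an import string of the form `module:attribute`, so the module has to be importable inside the container. The sketch below is a simplified approximation of that lookup, not uvicorn's actual code; `resolve_app` is a hypothetical helper name.

```python
import importlib


def resolve_app(import_string):
    """Resolve a "module:attribute" string to the object it names."""
    module_name, _, attribute = import_string.partition(":")
    # Raises ModuleNotFoundError if the module (or a parent package)
    # cannot be found on sys.path.
    module = importlib.import_module(module_name)
    # Raises AttributeError if the module has no such attribute.
    return getattr(module, attribute)
```

If `src` is not present in the image, a call such as `resolve_app("src.markitdown.api:app")` would be expected to fail with `ModuleNotFoundError`, which is consistent with the startup failure described above.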


4 participants